Chapter 1 - Statistical preliminaries: computer exercises

Author

Prof. Richard Wilkinson

Task 1

Finding the eigenvalues and eigenvectors of a matrix is easy in R.

A=matrix(c(3,1,1,6),nrow=2,byrow=TRUE)    # use a to define a matrix A
Eig=eigen(A)                     # the eigenvalues and eigenvectors of A
                                   # are stored in the list Eig
lambda=Eig$values                # extract the eigenvalues from Eig and
                                   # store in the vector e
lambda                           # you should see the eigenvalues in
[1] 6.302776 2.697224
                                   # descending order
Q=Eig$vectors                    # extract the eigenvectors from Eig and
                                   # store then in the columns of Q

The spectral decomposition of \(\mathbf A\) is \[\mathbf A= \mathbf Q\boldsymbol \Lambda\mathbf Q^\top\] Let’s check this in R (noting as always that there may be some numerical errors)

Q%*%diag(lambda)%*%t(Q)          # reconstruct A,
     [,1] [,2]
[1,]    3    1
[2,]    1    6
                                   # where t(Q) gives the transpose of Q

Since A is positive definite, we can calculate the symmetric, positive definite square root of A.

Asqrt=Q%*%diag(lambda**0.5)%*%t(Q) # lambda**0.5 contains the square roots
Asqrt%*%Asqrt                      # it is seen that A is recovered
     [,1] [,2]
[1,]    3    1
[2,]    1    6
  • Instead of using the full eigendecomposition for \(\mathbf A\), try truncating it and using just a single eigenvalue and eigenvector, i.e., compute \[\mathbf A' = \lambda_1 \mathbf q_1 \mathbf q_1^\top\]
Atrunc <- lambda[1]*Q[,1]%*%t(Q[,1])     
Atrunc
          [,1]     [,2]
[1,] 0.5292747 1.748075
[2,] 1.7480754 5.773501
  • Compute the difference between \(\mathbf A\) and \(\mathbf A'\) using the 2-norm and the Frobenius norm.

Let’s find the error in our low rank approximation

D = A-Atrunc

To reduce this to a single number we can look at the 2-norm

svd(D)$d[1]
[1] 2.697224

or the Frobenius norm

sqrt(sum(diag(t(D)%*%D)))  ## trace(D^T D)^0.5
[1] 2.697224
sqrt(sum(svd(D)$d^2))      ## (sum sigma_i^2)^0.5
[1] 2.697224

Note that the Frobenius norm = 2-norm in this case as D is a rank 1 matrix

Task 2

The singular value decomposition can be computed in R using the command svd. Let \(\mathbf X\) be the four numerical variables in the iris dataset with the column mean removed

n=150
H=diag(rep(1,n))-rep(1,n)%*%t(rep(1,n))/n   # calculate the centering matrix H
X=H%*% as.matrix(iris[,1:4])
# This can also be done using the command
# sweep(iris[,1:4], 2, colMeans(iris[,1:4]))  # do you understand why?
  • Compute the SVD of \(\mathbf X\) in R and report its singular values. ::: {.callout-note appearance=“minimal” collapse=“true”} ### Solution
X.svd <-svd(X)  # the compact svd
X.svd
$d
[1] 25.099960  6.013147  3.413681  1.884524

$u
               [,1]         [,2]         [,3]          [,4]
  [1,] -0.106937444 -0.053116484  0.008177340  1.200535e-03
  [2,] -0.108133305  0.029435704  0.061653182  5.254726e-02
  [3,] -0.115099407  0.024105417 -0.005243682  1.059599e-02
  [4,] -0.109376382  0.052933840 -0.009244970 -4.010341e-02
  [5,] -0.108713978 -0.054340014 -0.026387718 -3.250614e-02
  [6,] -0.090871045 -0.123284929 -0.049412255 -1.284190e-02
  [7,] -0.112372199  0.014877630 -0.075546656 -2.554657e-02
  [8,] -0.104627455 -0.027171288  0.006409304 -2.403678e-02
  [9,] -0.114995509  0.096174552 -0.006081287 -1.419178e-02
 [10,] -0.106484463  0.018920914  0.057894322 -2.987249e-02
 [11,] -0.099878528 -0.107276416  0.022063578 -7.970144e-03
 [12,] -0.104093998 -0.002449622 -0.029923789 -8.298077e-02
 [13,] -0.111000544  0.039099657  0.060592789 -4.185626e-03
 [14,] -0.128438599  0.085046076 -0.017957061 -1.150413e-02
 [15,] -0.105368707 -0.196031223  0.044417607  8.448274e-02
 [16,] -0.095061466 -0.222522790 -0.081371672  3.476500e-03
 [17,] -0.104523188 -0.134817835 -0.040479249  8.900644e-02
 [18,] -0.105509995 -0.051861218 -0.007812188  4.119247e-02
 [19,] -0.087642382 -0.145155105  0.035242173  1.435475e-02
 [20,] -0.103107190 -0.085406240 -0.062590850 -3.516679e-02
 [21,] -0.092042225 -0.065081714  0.070142485 -7.997136e-03
 [22,] -0.101342998 -0.072008224 -0.061065241  2.179087e-02
 [23,] -0.128125278 -0.022196042 -0.085654395  2.378386e-03
 [24,] -0.091742502 -0.016415506 -0.011460726  7.872154e-02
 [25,] -0.093854891  0.006200058 -0.036623544 -1.593670e-01
 [26,] -0.099867445  0.024282937  0.074236599  1.836404e-02
 [27,] -0.098359520 -0.021777529 -0.027803004  3.048501e-02
 [28,] -0.102084619 -0.061152477  0.022994010 -7.520597e-03
 [29,] -0.105160911 -0.051892954  0.042742398  3.490721e-02
 [30,] -0.104860300  0.032755097 -0.011943437 -6.579027e-02
 [31,] -0.103083767  0.033978627  0.022621621 -3.208359e-02
 [32,] -0.096013398 -0.068337634  0.042629932  1.229109e-01
 [33,] -0.105532530 -0.135264242 -0.066107282 -1.493069e-01
 [34,] -0.103535492 -0.181792610 -0.046228933 -5.059574e-02
 [35,] -0.105057013  0.020176181  0.041904793  1.011945e-02
 [36,] -0.114193074 -0.011535468  0.048139332  8.628093e-02
 [37,] -0.104591322 -0.099676589  0.078610276  9.362643e-02
 [38,] -0.111581216 -0.044676061 -0.027448110 -8.923903e-02
 [39,] -0.118745288  0.081148576 -0.021363172 -5.695417e-03
 [40,] -0.103187665 -0.038090508  0.023459225 -7.295824e-03
 [41,] -0.110362820 -0.043825224 -0.022628858  4.991360e-02
 [42,] -0.113520844  0.156483870  0.102303178  1.697976e-01
 [43,] -0.119418776  0.056863076 -0.056393445 -3.962687e-02
 [44,] -0.095841366 -0.031409745 -0.077297197  9.350315e-02
 [45,] -0.088027598 -0.072618067 -0.087513384 -9.702320e-02
 [46,] -0.108145646  0.041610190  0.028613732  7.579824e-02
 [47,] -0.101121604 -0.083778280 -0.048834573 -1.006208e-01
 [48,] -0.113126161  0.037907863 -0.024526855 -3.160705e-02
 [49,] -0.101318317 -0.096357196  0.005013657 -2.471110e-02
 [50,] -0.107703747 -0.017911765  0.026157693  1.839104e-02
 [51,]  0.051188355 -0.113943735  0.119099608  9.830224e-03
 [52,]  0.037150996 -0.052939604  0.005277057  3.006129e-04
 [53,]  0.058338830 -0.083860046  0.099108792 -8.772381e-04
 [54,]  0.007303506  0.137691455  0.052609313  4.965013e-02
 [55,]  0.043350796 -0.012404598  0.090154273  5.944239e-02
 [56,]  0.025564546  0.069555400 -0.012032787 -1.290070e-01
 [57,]  0.043627984 -0.047141414 -0.049744032 -4.433830e-02
 [58,] -0.029845572  0.167115638 -0.003604004 -9.502520e-03
 [59,]  0.041598943 -0.037977100  0.121668114 -2.076626e-02
 [60,] -0.000348423  0.120250155 -0.082357274 -2.981612e-03
 [61,] -0.020232736  0.210533870  0.079039960  2.417717e-02
 [62,]  0.020386429  0.017292314 -0.038242520  2.691356e-02
 [63,]  0.010556850  0.091472307  0.203342639  3.034482e-02
 [64,]  0.039240481  0.020757491  0.018195729 -8.994117e-02
 [65,] -0.006929309  0.042382831 -0.026498580  6.644507e-02
 [66,]  0.036966623 -0.077693006  0.092164736  5.295935e-02
 [67,]  0.026306167  0.058699653 -0.096092038 -9.969556e-02
 [68,]  0.009406588  0.055480225  0.079433862 -1.134278e-01
 [69,]  0.037638853  0.090326333  0.146328582  1.364760e-01
 [70,]  0.001801874  0.097092976  0.068841268 -2.206215e-02
 [71,]  0.044473503  0.014071974 -0.134640888 -3.981460e-02
 [72,]  0.014258525  0.011462389  0.067333155  6.526722e-02
 [73,]  0.051720555  0.054511770  0.101900086  4.714033e-04
 [74,]  0.036722326  0.030389708  0.067689922 -1.529593e-01
 [75,]  0.028480257 -0.024788340  0.094268027  2.213809e-02
 [76,]  0.035863577 -0.054631036  0.092629952  5.318412e-02
 [77,]  0.053068786 -0.040651070  0.152827061  1.874914e-02
 [78,]  0.062063929 -0.044485097  0.048311779  3.712837e-02
 [79,]  0.032402069  0.027165524 -0.010377217 -1.576602e-02
 [80,] -0.012174672  0.061242834  0.093298587  3.956948e-02
 [81,] -0.002714207  0.117271719  0.071539736  3.624705e-03
 [82,] -0.007554692  0.113133226  0.089762515 -1.090514e-02
 [83,]  0.005435415  0.052224304  0.051921309  1.748023e-02
 [84,]  0.054981220  0.070005649 -0.004735981 -9.461514e-02
 [85,]  0.023426588  0.080538092 -0.130191880 -1.331775e-01
 [86,]  0.032145800 -0.032292958 -0.113942429 -6.060272e-02
 [87,]  0.048633180 -0.067788060  0.069475453  1.656503e-02
 [88,]  0.032473965  0.061870604  0.180075674  8.172941e-02
 [89,]  0.009799126  0.044656214 -0.055179975 -7.783109e-02
 [90,]  0.006630019  0.113405956  0.017579039  1.571868e-02
 [91,]  0.018517969  0.111540846  0.007120425 -1.430873e-01
 [92,]  0.035490701  0.005731515  0.002913844 -8.144481e-02
 [93,]  0.009185195  0.067250280  0.067203194  8.983875e-03
 [94,] -0.028069039  0.168339168  0.030961054  2.420416e-02
 [95,]  0.014222393  0.083967690 -0.004867816 -5.239599e-02
 [96,]  0.013224502  0.035364955 -0.024373777 -1.265442e-01
 [97,]  0.014988695  0.048762971 -0.022848169 -6.958649e-02
 [98,]  0.025600678 -0.002949901  0.060168185 -1.134381e-02
 [99,] -0.036114394  0.125740036  0.003690927  1.233919e-01
[100,]  0.011912403  0.058022494 -0.003099780 -2.715868e-02
[101,]  0.100844491  0.001637929 -0.222682057 -1.541799e-02
[102,]  0.056383988  0.095609888 -0.086804408 -8.121243e-03
[103,]  0.104250205 -0.057191871  0.032454085  3.490115e-02
[104,]  0.078547178  0.029889157 -0.031761806 -1.256503e-01
[105,]  0.093625881  0.006695487 -0.083601717 -9.054452e-05
[106,]  0.135340402 -0.091605384  0.102070929 -5.962868e-02
[107,]  0.020766258  0.198358472 -0.159844858 -5.206972e-02
[108,]  0.116836322 -0.059120454  0.123104643 -1.364754e-01
[109,]  0.092479381  0.040549730  0.102031921 -4.174775e-02
[110,]  0.116205401 -0.130180070 -0.124011430  5.889132e-02
[111,]  0.066206246 -0.040283132 -0.071020173  6.422873e-02
[112,]  0.071848797  0.035861023  0.011028615  4.140030e-02
[113,]  0.086278694 -0.035967119 -0.009762671  8.652664e-02
[114,]  0.053632100  0.129186647 -0.082580332  7.452328e-02
[115,]  0.063184491  0.089743470 -0.184267186  1.748727e-01
[116,]  0.075874876 -0.019831660 -0.140505182  1.165394e-01
[117,]  0.077676977 -0.006975259 -0.012943849 -8.367203e-02
[118,]  0.138926727 -0.195528108 -0.039223023 -1.640837e-01
[119,]  0.151221171 -0.042793392  0.150502586  2.857279e-02
[120,]  0.051824453  0.126580905  0.101062482 -2.431636e-02
[121,]  0.096725966 -0.062894852 -0.064188583  9.839584e-02
[122,]  0.047769044  0.100794391 -0.149942412  3.234723e-02
[123,]  0.139439265 -0.076611144  0.167907401 -7.441030e-02
[124,]  0.055329415  0.033992070  0.018901228  8.651576e-02
[125,]  0.090654745 -0.055709695 -0.083824505 -3.203567e-02
[126,]  0.104147195 -0.093279163  0.060209066 -1.277272e-01
[127,]  0.050139846  0.029885314 -0.013430578  7.827116e-02
[128,]  0.051439605  0.019402260 -0.067744024  2.136672e-03
[129,]  0.084606059  0.034878486 -0.045165333  2.803219e-02
[130,]  0.095139712 -0.077270650  0.131684900 -1.228555e-01
[131,]  0.113214233 -0.062408111  0.146146675 -1.185258e-02
[132,]  0.128712301 -0.228526760  0.033555630 -1.341994e-01
[133,]  0.086033508  0.036133752 -0.061154862  6.802413e-02
[134,]  0.057536395  0.023849974  0.044888173 -1.013499e-01
[135,]  0.070968033  0.083134779  0.050641875 -2.682028e-01
[136,]  0.122569912 -0.114430203  0.098308052  1.644066e-01
[137,]  0.085428155 -0.023292993 -0.215274659  2.947269e-02
[138,]  0.075900444 -0.008198789 -0.047508907 -1.173787e-01
[139,]  0.046586780  0.027438253 -0.082560693  1.085780e-02
[140,]  0.083968704 -0.061912315 -0.007994635  1.117639e-01
[141,]  0.092197544 -0.030541623 -0.094529565  1.473337e-01
[142,]  0.076584495 -0.068051461 -0.033273937  2.681341e-01
[143,]  0.056383988  0.095609888 -0.086804408 -8.121243e-03
[144,]  0.102112248 -0.046209179 -0.085705007  3.073071e-02
[145,]  0.096364542 -0.050688629 -0.147782619  1.279321e-01
[146,]  0.077454696 -0.031187046 -0.052091894  2.261558e-01
[147,]  0.060843387  0.062416062  0.035708722  1.349771e-01
[148,]  0.070292769 -0.013114406 -0.038223151  7.269810e-02
[149,]  0.075734845 -0.019395493 -0.211868549  2.366397e-02
[150,]  0.055386098  0.047007153 -0.106310369 -8.226941e-02

$v
            [,1]        [,2]        [,3]       [,4]
[1,]  0.36138659 -0.65658877  0.58202985  0.3154872
[2,] -0.08452251 -0.73016143 -0.59791083 -0.3197231
[3,]  0.85667061  0.17337266 -0.07623608 -0.4798390
[4,]  0.35828920  0.07548102 -0.54583143  0.7536574
sigma <- X.svd$d
sigma # singular values
[1] 25.099960  6.013147  3.413681  1.884524

:::

  • Does R report the full or compact SVD?

R gives the compact SVD. But in this case the compact and full SVD are the same.

  • Check that \(\mathbf X\mathbf v= \sigma \mathbf u\). ::: {.callout-note appearance=“minimal” collapse=“true”} ### Solution
U <- X.svd$u
V <- X.svd$v

# check Xv = sigma U
t(X %*% V[,1]) - sigma[1] %*% U[,1]
             [,1]         [,2]          [,3]         [,4]         [,5] [,6]
[1,] 1.598721e-14 2.220446e-15 -1.332268e-15 1.776357e-15 1.332268e-15    0
             [,7]         [,8]         [,9]        [,10]        [,11]
[1,] 8.881784e-16 8.881784e-16 2.220446e-15 1.776357e-15 4.440892e-16
            [,12]        [,13]        [,14]         [,15]         [,16] [,17]
[1,] 1.332268e-15 8.881784e-16 2.220446e-15 -1.332268e-15 -8.881784e-16     0
            [,18] [,19]        [,20]        [,21]        [,22]        [,23]
[1,] 8.881784e-16     0 8.881784e-16 4.440892e-16 4.440892e-16 8.881784e-16
     [,24]        [,25]        [,26]        [,27]        [,28]        [,29]
[1,]     0 1.776357e-15 1.332268e-15 8.881784e-16 8.881784e-16 4.440892e-16
            [,30]        [,31] [,32]        [,33] [,34]        [,35]
[1,] 1.776357e-15 1.332268e-15     0 4.440892e-16     0 1.332268e-15
            [,36] [,37]        [,38]        [,39]        [,40]        [,41]
[1,] 4.440892e-16     0 8.881784e-16 2.220446e-15 8.881784e-16 8.881784e-16
            [,42]        [,43]        [,44]        [,45]        [,46]
[1,] 1.332268e-15 2.664535e-15 8.881784e-16 4.440892e-16 1.332268e-15
            [,47]        [,48]        [,49]        [,50]         [,51]
[1,] 4.440892e-16 8.881784e-16 8.881784e-16 4.440892e-16 -1.998401e-15
             [,52]         [,53]        [,54]         [,55]        [,56]
[1,] -9.992007e-16 -1.332268e-15 6.383782e-16 -8.881784e-16 4.440892e-16
             [,57]        [,58]         [,59]       [,60]        [,61] [,62]
[1,] -6.661338e-16 1.776357e-15 -1.332268e-15 1.13104e-15 1.332268e-15     0
             [,63]         [,64]        [,65]         [,66]        [,67]
[1,] -5.551115e-17 -1.110223e-16 3.608225e-16 -8.881784e-16 6.661338e-16
            [,68]         [,69]        [,70] [,71]         [,72]         [,73]
[1,] 2.220446e-16 -5.551115e-16 5.620504e-16     0 -2.220446e-16 -2.220446e-16
             [,74]         [,75]         [,76]         [,77]         [,78]
[1,] -1.110223e-16 -8.881784e-16 -8.881784e-16 -1.110223e-15 -1.776357e-15
             [,79]        [,80]        [,81]        [,82]        [,83] [,84]
[1,] -2.220446e-16 2.220446e-16 4.440892e-16 4.996004e-16 2.220446e-16     0
            [,85]         [,86]         [,87]         [,88]        [,89]
[1,] 7.771561e-16 -2.220446e-16 -1.998401e-15 -7.771561e-16 4.440892e-16
            [,90]        [,91]         [,92]        [,93]        [,94]
[1,] 6.383782e-16 8.881784e-16 -3.330669e-16 2.498002e-16 1.332268e-15
            [,95]        [,96]        [,97]         [,98]        [,99]
[1,] 4.996004e-16 3.330669e-16 3.330669e-16 -5.551115e-16 9.992007e-16
           [,100]        [,101]       [,102]        [,103] [,104]        [,105]
[1,] 3.330669e-16 -4.440892e-16 4.440892e-16 -1.332268e-15      0 -4.440892e-16
            [,106]       [,107]        [,108]        [,109]        [,110]
[1,] -2.220446e-15 2.220446e-15 -2.220446e-15 -4.440892e-16 -1.332268e-15
            [,111]        [,112]        [,113]       [,114]       [,115]
[1,] -1.110223e-15 -6.661338e-16 -1.332268e-15 4.440892e-16 4.440892e-16
            [,116]        [,117]        [,118]        [,119]        [,120]
[1,] -6.661338e-16 -6.661338e-16 -2.664535e-15 -2.664535e-15 -2.220446e-16
            [,121]       [,122]        [,123]        [,124]        [,125]
[1,] -1.332268e-15 6.661338e-16 -2.220446e-15 -4.440892e-16 -8.881784e-16
            [,126]        [,127]       [,128]        [,129]        [,130]
[1,] -2.220446e-15 -2.220446e-16 2.220446e-16 -8.881784e-16 -2.220446e-15
            [,131]        [,132]        [,133]        [,134] [,135]
[1,] -3.108624e-15 -1.776357e-15 -8.881784e-16 -4.440892e-16      0
            [,136]        [,137]        [,138]        [,139]        [,140]
[1,] -2.220446e-15 -1.332268e-15 -8.881784e-16 -2.220446e-16 -1.332268e-15
            [,141]        [,142]       [,143]        [,144]        [,145]
[1,] -8.881784e-16 -1.332268e-15 4.440892e-16 -1.332268e-15 -8.881784e-16
            [,146]        [,147]        [,148]       [,149]       [,150]
[1,] -1.332268e-15 -4.440892e-16 -6.661338e-16 2.220446e-16 6.661338e-16
X %*% V ==U %*% diag(sigma) # won't work
        [,1]  [,2]  [,3]  [,4]
  [1,] FALSE FALSE FALSE FALSE
  [2,] FALSE FALSE FALSE FALSE
  [3,] FALSE FALSE FALSE FALSE
  [4,] FALSE FALSE FALSE FALSE
  [5,] FALSE FALSE FALSE FALSE
  [6,]  TRUE  TRUE FALSE FALSE
  [7,] FALSE FALSE FALSE FALSE
  [8,] FALSE FALSE FALSE FALSE
  [9,] FALSE FALSE FALSE FALSE
 [10,] FALSE FALSE FALSE FALSE
 [11,] FALSE FALSE FALSE FALSE
 [12,] FALSE FALSE FALSE FALSE
 [13,] FALSE FALSE FALSE FALSE
 [14,] FALSE FALSE FALSE FALSE
 [15,] FALSE FALSE FALSE FALSE
 [16,] FALSE FALSE FALSE FALSE
 [17,]  TRUE FALSE FALSE FALSE
 [18,] FALSE FALSE FALSE FALSE
 [19,]  TRUE FALSE FALSE FALSE
 [20,] FALSE  TRUE FALSE FALSE
 [21,] FALSE  TRUE FALSE FALSE
 [22,] FALSE  TRUE FALSE FALSE
 [23,] FALSE FALSE FALSE FALSE
 [24,]  TRUE FALSE FALSE FALSE
 [25,] FALSE FALSE FALSE FALSE
 [26,] FALSE FALSE FALSE FALSE
 [27,] FALSE FALSE FALSE FALSE
 [28,] FALSE  TRUE FALSE FALSE
 [29,] FALSE  TRUE FALSE FALSE
 [30,] FALSE FALSE FALSE FALSE
 [31,] FALSE FALSE FALSE FALSE
 [32,]  TRUE  TRUE FALSE FALSE
 [33,] FALSE  TRUE FALSE FALSE
 [34,]  TRUE  TRUE FALSE FALSE
 [35,] FALSE FALSE FALSE FALSE
 [36,] FALSE FALSE FALSE FALSE
 [37,]  TRUE FALSE FALSE FALSE
 [38,] FALSE FALSE FALSE FALSE
 [39,] FALSE FALSE FALSE FALSE
 [40,] FALSE FALSE FALSE FALSE
 [41,] FALSE FALSE FALSE FALSE
 [42,] FALSE FALSE FALSE FALSE
 [43,] FALSE FALSE FALSE FALSE
 [44,] FALSE FALSE FALSE FALSE
 [45,] FALSE FALSE FALSE FALSE
 [46,] FALSE FALSE FALSE FALSE
 [47,] FALSE FALSE FALSE FALSE
 [48,] FALSE FALSE FALSE FALSE
 [49,] FALSE FALSE FALSE FALSE
 [50,] FALSE  TRUE FALSE FALSE
 [51,] FALSE FALSE FALSE FALSE
 [52,] FALSE FALSE FALSE FALSE
 [53,] FALSE FALSE FALSE FALSE
 [54,] FALSE FALSE FALSE FALSE
 [55,] FALSE FALSE FALSE FALSE
 [56,] FALSE FALSE FALSE FALSE
 [57,] FALSE FALSE FALSE FALSE
 [58,] FALSE FALSE FALSE FALSE
 [59,] FALSE FALSE FALSE FALSE
 [60,] FALSE FALSE FALSE FALSE
 [61,] FALSE FALSE FALSE FALSE
 [62,]  TRUE FALSE  TRUE FALSE
 [63,] FALSE FALSE  TRUE FALSE
 [64,] FALSE FALSE FALSE FALSE
 [65,] FALSE FALSE FALSE FALSE
 [66,] FALSE FALSE FALSE FALSE
 [67,] FALSE FALSE FALSE FALSE
 [68,] FALSE  TRUE FALSE FALSE
 [69,] FALSE FALSE  TRUE FALSE
 [70,] FALSE  TRUE FALSE  TRUE
 [71,]  TRUE FALSE  TRUE FALSE
 [72,] FALSE FALSE FALSE FALSE
 [73,] FALSE FALSE FALSE FALSE
 [74,] FALSE FALSE FALSE FALSE
 [75,] FALSE FALSE FALSE FALSE
 [76,] FALSE FALSE FALSE FALSE
 [77,] FALSE FALSE FALSE FALSE
 [78,] FALSE FALSE FALSE FALSE
 [79,] FALSE FALSE  TRUE FALSE
 [80,] FALSE FALSE FALSE FALSE
 [81,] FALSE FALSE FALSE FALSE
 [82,] FALSE FALSE FALSE FALSE
 [83,] FALSE FALSE FALSE FALSE
 [84,]  TRUE FALSE FALSE FALSE
 [85,] FALSE FALSE FALSE FALSE
 [86,] FALSE FALSE FALSE FALSE
 [87,] FALSE FALSE FALSE FALSE
 [88,] FALSE FALSE FALSE FALSE
 [89,] FALSE FALSE FALSE FALSE
 [90,] FALSE  TRUE FALSE FALSE
 [91,] FALSE FALSE FALSE FALSE
 [92,] FALSE FALSE FALSE FALSE
 [93,] FALSE FALSE FALSE FALSE
 [94,] FALSE  TRUE FALSE FALSE
 [95,] FALSE FALSE FALSE FALSE
 [96,] FALSE FALSE FALSE FALSE
 [97,] FALSE FALSE FALSE FALSE
 [98,] FALSE FALSE FALSE FALSE
 [99,] FALSE  TRUE FALSE FALSE
[100,] FALSE  TRUE FALSE FALSE
[101,] FALSE FALSE FALSE FALSE
[102,] FALSE FALSE FALSE FALSE
[103,] FALSE FALSE FALSE FALSE
[104,]  TRUE FALSE FALSE FALSE
[105,] FALSE FALSE FALSE FALSE
[106,] FALSE FALSE FALSE FALSE
[107,] FALSE FALSE FALSE FALSE
[108,] FALSE  TRUE FALSE FALSE
[109,] FALSE FALSE FALSE FALSE
[110,] FALSE  TRUE FALSE FALSE
[111,] FALSE FALSE FALSE FALSE
[112,] FALSE FALSE FALSE FALSE
[113,] FALSE FALSE FALSE FALSE
[114,] FALSE FALSE FALSE FALSE
[115,] FALSE FALSE FALSE FALSE
[116,] FALSE FALSE FALSE FALSE
[117,] FALSE FALSE FALSE FALSE
[118,] FALSE FALSE FALSE FALSE
[119,] FALSE FALSE FALSE FALSE
[120,] FALSE FALSE FALSE FALSE
[121,] FALSE FALSE FALSE FALSE
[122,] FALSE FALSE FALSE FALSE
[123,] FALSE FALSE FALSE FALSE
[124,] FALSE FALSE FALSE  TRUE
[125,] FALSE FALSE FALSE FALSE
[126,] FALSE FALSE FALSE FALSE
[127,] FALSE FALSE FALSE FALSE
[128,] FALSE FALSE FALSE FALSE
[129,] FALSE FALSE FALSE FALSE
[130,] FALSE FALSE FALSE FALSE
[131,] FALSE FALSE FALSE FALSE
[132,] FALSE  TRUE FALSE FALSE
[133,] FALSE FALSE FALSE FALSE
[134,] FALSE FALSE FALSE FALSE
[135,]  TRUE FALSE FALSE  TRUE
[136,] FALSE FALSE FALSE FALSE
[137,] FALSE FALSE FALSE FALSE
[138,] FALSE FALSE FALSE FALSE
[139,] FALSE FALSE  TRUE FALSE
[140,] FALSE FALSE FALSE FALSE
[141,] FALSE FALSE FALSE FALSE
[142,] FALSE FALSE FALSE  TRUE
[143,] FALSE FALSE FALSE FALSE
[144,] FALSE  TRUE FALSE FALSE
[145,] FALSE FALSE FALSE FALSE
[146,] FALSE FALSE FALSE  TRUE
[147,] FALSE FALSE FALSE FALSE
[148,] FALSE FALSE FALSE  TRUE
[149,] FALSE FALSE FALSE FALSE
[150,] FALSE FALSE FALSE FALSE
max(abs(X %*% V -U %*% diag(sigma)))
[1] 1.598721e-14

:::

  • Compute the best rank-1, rank-2, and rank-3 approximations to \(\mathbf X\), and report the 2-norm and Frobenious norm for these approximations ::: {.callout-note appearance=“minimal” collapse=“true”} ### Solution
norms <- data.frame()

# rank=1

Xr <-sigma[1] * U[,1]  %*% t(V[,1])
D=X-Xr
norms[1,1] <- sqrt(sum(diag(t(D)%*%D)))
norms[1,2] <- svd(D)$d[1]

for (rank in 2:4){
  # doesn't work with rank=1 
  # due to R treating vectors and matrices 
  #differently
  Xr <- U[,1:rank] %*% 
    diag(sigma[1:rank]) %*% t(V[,1:rank])
  D=X-Xr
  # Frobenius norm
  norms[rank,1] <- sqrt(sum(diag(t(D)%*%D)))
  norms[rank,2] <- svd(D)$d[1] #2 -rnorm
}

colnames(norms) <- c("Frobenius-norm", 
                     "2-norm")
norms
  Frobenius-norm       2-norm
1   7.166770e+00 6.013147e+00
2   3.899313e+00 3.413681e+00
3   1.884524e+00 1.884524e+00
4   3.691427e-14 3.576058e-14

:::

  • Compute the eigenvalues of \(\mathbf X^\top \mathbf X\). How do these relate to the singular values? How does \(\mathbf X^\top \mathbf X\) relate to the sample covariance matrix of the iris data? How do the singular values relate to the eigenvalues of the covariance matrix?
XtX.eig <- eigen( t(X) %*% X )
XtX.eig$values
[1] 630.008014  36.157941  11.653216   3.551429
svd(X)$d^2
[1] 630.008014  36.157941  11.653216   3.551429
cov(X)*149/150
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length   0.68112222 -0.04215111    1.2658200   0.5128289
Sepal.Width   -0.04215111  0.18871289   -0.3274587  -0.1208284
Petal.Length   1.26582000 -0.32745867    3.0955027   1.2869720
Petal.Width    0.51282889 -0.12082844    1.2869720   0.5771329
t(X)%*%X/150
             Sepal.Length Sepal.Width Petal.Length Petal.Width
Sepal.Length   0.68112222 -0.04215111    1.2658200   0.5128289
Sepal.Width   -0.04215111  0.18871289   -0.3274587  -0.1208284
Petal.Length   1.26582000 -0.32745867    3.0955027   1.2869720
Petal.Width    0.51282889 -0.12082844    1.2869720   0.5771329
svd(X)$d/sqrt(150)
[1] 2.0494032 0.4909714 0.2787259 0.1538707
sqrt(eigen(cov(X)*149/150)$values)
[1] 2.0494032 0.4909714 0.2787259 0.1538707
svd(X)$d/sqrt(149)
[1] 2.0562689 0.4926162 0.2796596 0.1543862
sqrt(eigen(cov(X))$values)
[1] 2.0562689 0.4926162 0.2796596 0.1543862
  • Let \(\mathbf S\) be the sample covariance matrix of the iris dataset. What vector \(\mathbf x\) with \(||\mathbf x||=1\) maximizes \(\mathbf x^\top \mathbf S\mathbf x\)?
# obtained at first eigenvector, 
# or right singular vector

S<-cov(X)
eigen(S)$vectors[,1]
[1]  0.36138659 -0.08452251  0.85667061  0.35828920
V[,1]
[1]  0.36138659 -0.08452251  0.85667061  0.35828920
# Let's check by directly maximizing
xSx <- function(x){
  - t(x) %*%S %*%x / sum(x^2)
  # maximizing x^T Sx subject to x^T x=1
  # is equivalent to maximizing
  # x^T S x/ (x^T x)
  # the minus sign because `optim` minmizes
}
xSx(c(1:4))
          [,1]
[1,] -2.519568
out <- optim(c(1,1,1,1),xSx, )
out$par/sqrt(sum(out$par^2)) # normalise
[1]  0.36136956 -0.08456844  0.85666220  0.35831564
eigen(S)$vectors[,1]
[1]  0.36138659 -0.08452251  0.85667061  0.35828920

Task 3

Choose an few images from the USC-SIPI Image Database and repeat the image compression example from the notes. Which type of images compress well do you think?

library(tiff)
library(rasterImage)
Loading required package: plotrix
peppers<-readTIFF("../Notes_bookdown/Bookdown/figs/peppers.tiff")
plot(as.raster(peppers))

svd_image <- function(im,k){
  s <- svd(im)
  Sigma_k <- diag(s$d[1:k])
  U_k <- s$u[,1:k]
  V_k <- s$v[,1:k]
  im_k <- U_k %*% Sigma_k %*% t(V_k)
  ## the reduced rank SVD produces some intensities <0 and >1. 
  # Let's truncate these
  im_k[im_k>1]=1
  im_k[im_k<0]=0
  return(im_k)
}

par(mfrow=c(2,2), mar=c(1,1,1,1))

pepprssvd<- peppers
for(k in c(4,30,100,300)){
  svds<-list()
  for(ii in 1:3) {
    pepprssvd[,,ii]<-svd_image(peppers[,,ii],k)
  }
  plot(as.raster(pepprssvd))
}

# http://sipi.usc.edu/database/database.php?volume=misc&image=20#top
# I've used the black and white
# couple image
couple<-readTIFF("../Notes_bookdown/Bookdown/figs/Couple.tiff")
plot(as.raster(couple))

par(mfrow=c(2,2), mar=c(1,1,1,1))

couplesvd<- couple
for(k in c(4,30,100,300)){
  svds<-list()
  
  couplesvd<-svd_image(couple[,],k)
  
  plot(as.raster(couplesvd))
}

Task 4

We won’t discuss how the SVD is computed in practice, but there are a variety of approaches that can be used. Try the following iterative approach for computing the first singular vectors:

X <- as.matrix(iris[,1:4])
v <- rnorm(dim(X)[2])

for (iter in 1:500){
  u <- X %*%v
  u <- u/sqrt(sum(u^2)) 
  v <- t(X) %*%u
  v <- v/sqrt(sum(v^2)) 
}
X.svd <- svd(X)
X.svd$v[,1]/v
X.svd$u[,1]/u